Abstract

We introduce a transformation of finite integer sequences, show that every sequence
eventually stabilizes under this transformation, and show that the fixed points of each
length are counted by the Catalan numbers. The sequences that are fixed are precisely
those that describe themselves: every term t is equal to the number of previous
terms that are smaller than t. In addition, we provide an easy way to enumerate all
these self-describing sequences by organizing them in a Catalan tree with a specific
labelling system.

Prefix ordered sequences and rooted labelled trees 

The following connection between prefix ordered sequences and rooted labelled trees is 
well known and we briefly mention only the instance which is useful for our considerations. 

Let $A$ be the set of finite integer sequences $a = (a_0, a_1, \dots)$ with the property that
$0 \le a_i \le i$, for all indices. We order the sequences in $A$ by the prefix relation, i.e.,

\[ (a_0, a_1, \dots, a_n) \preceq (b_0, b_1, \dots, b_m) \]

if $n \le m$ and $a_i = b_i$, for $i = 0, \dots, n$. The sequences in $A$ can be organized in a rooted
labelled tree $T$ which reflects the prefix order relation. The root of the tree $T$ is labelled
by 0. Every vertex that is at distance $n$ from the root has $n + 2$ children labelled by
$0, 1, \dots, n, n + 1$ (see Figure 1). The vertices whose distance to the root is $n$ form the $n$-th
level of the tree $T$, which is also called the $n$-th generation. For every vertex $v$ at the
level $n$ in the tree $T$ there exists a unique path of length $n$ from the root to $v$. The labels
of the vertices on this path form a unique sequence $(a_0, a_1, \dots, a_n)$ in $A$ that corresponds
to the vertex $v$ and this sequence is called the full name of $v$. The correspondence

\[ v \mapsto \text{the full name of } v \]

provides a bijection between the vertices in $T$ and the sequences in $A$. Under this bijection,
the vertices from the $n$-th generation in $T$ correspond to the sequences of length $n + 1$ in
$A$. The set of vertices in the $n$-th generation is denoted by $T_n$ and the corresponding set
of sequences by $A_n$.

Figure 1: The rooted labelled tree T up to the third generation
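The correspondence between vertices and sequences can be explored with a short Python sketch. The helper names `sequences`, `children` and `is_prefix` are illustrative choices and do not come from the paper; the sketch only assumes the definition of the sequences above (term number i lies between 0 and i).

```python
from itertools import product


def sequences(n):
    """All full names in generation n: tuples (a_0, ..., a_n) with 0 <= a_i <= i."""
    return list(product(*(range(i + 1) for i in range(n + 1))))


def children(a):
    """Full names of the children of the vertex with full name a."""
    n = len(a) - 1                      # generation of the vertex a
    return [a + (label,) for label in range(n + 2)]


def is_prefix(a, b):
    """Prefix order: a is an initial segment of b."""
    return len(a) <= len(b) and tuple(b[:len(a)]) == tuple(a)


if __name__ == "__main__":
    for n in range(4):
        print(n, len(sequences(n)))     # generation sizes 1, 2, 6, 24
    print(children((0, 1)))
```

Since generation n is a product of ranges of sizes 1, 2, ..., n + 1, it contains (n + 1)! sequences, which is why brute-force checks are only practical for small n.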

The sequence $a = (a_0, a_1, \dots, a_n)$ is a prefix of the sequence $b = (b_0, b_1, \dots, b_m)$ if and
only if the vertex $v_a$ with full name $a$ is on the unique path between the root and the vertex
$v_b$ with full name $b$, i.e., if and only if the vertex $v_a$ is an ancestor of the vertex $v_b$. Consider
a graph endomorphism $\alpha$ of $T$ that fixes the root (and therefore also preserves the levels).
Such an endomorphism corresponds to a transformation of sequences $\alpha : A \to A$ that
preserves the length of the sequences and also their prefix order, i.e.,

\[ a \preceq b \text{ implies } \alpha a \preceq \alpha b, \]

for all sequences $a$ and $b$ in $A$.

In the sequel, we often deliberately blur the distinction between the vertices in T and 
the corresponding sequences in A. Similarly, we do not distinguish tree endomorphisms 
of T fixing the root from sequence transformations that preserve the length and the prefix 
order. This mistake actually improves our presentation. 

Let $\alpha$ be an endomorphism of $T$. Since every generation in $T$ is finite, the $\alpha$-orbit

\[ \alpha^* u = \{ \alpha^i u \mid i \ge 0 \} \]

of every vertex $u$ of $T$ is finite. Thus, starting from any vertex, repeated applications of
$\alpha$ produce periodic points, i.e., points $a$ for which $\alpha^k a = a$ for some $k > 0$. The period
of the periodic point $a$ is the smallest $k > 0$ for which $\alpha^k a = a$. The points of period 1 are
fixed points and the points of period dividing 2 are double points. Obviously, if $u$ and $v$
are periodic points of $\alpha$ and $u$ is a prefix of $v$ then the period of $u$ divides the period of $v$.

It is sometimes easy to estimate how long it takes before a periodic point is reached.
We make use of the lexicographic ordering $\le$ of the sequences in $A_n$ (note the difference
with the prefix ordering $\preceq$). Namely, for $a = (a_0, a_1, \dots, a_n)$ and $b = (b_0, b_1, \dots, b_n)$, set
$a < b$ if $a_j < b_j$ at the first index $j$ where $a$ and $b$ differ.

Theorem 1. Let $\alpha$ be an endomorphism of the tree $T$ and assume that, for some $n \ge 1$,
there exists $k \ge 1$ such that, for every vertex $u$ in generation $n$, either

\[ u \le \alpha^k u \le \alpha^{2k} u \le \cdots \]

or

\[ u \ge \alpha^k u \ge \alpha^{2k} u \ge \cdots . \]

Then, starting from any point in generation $n$, repeated applications of $\alpha$ lead to a periodic
point of period dividing $k$ in $O(n^2)$ steps.

Proof. We show that $\beta = \alpha^k$ reaches a fixed point in no more than

\[ 1 + 2 + \cdots + n = n(n + 1)/2 \]

steps.

Start with any vertex $u$ in generation $n$. Without loss of generality we may assume

\[ u \le \beta u \le \beta^2 u \le \cdots . \]

After the first application of $\beta$ the initial segment up to index 1 of $\beta u$ is fixed under $\beta$.
After the next two steps the entry at index 2 will be fixed. Proceeding in the same fashion
we see that the initial segment of $\beta^{1 + 2 + \cdots + i} u$ up to index $i$ is fixed under $\beta$. Indeed, once
the initial segment up to index $i - 1$ is fixed the entry at index $i$ can go up no more than
$i$ times (from 0 to $i$) before it stabilizes. Thus, $\beta^{n(n+1)/2} u$ is fixed under $\beta$, i.e., it is a
periodic point of $\alpha$ of period dividing $k$. □
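The stabilization argument can be illustrated with a small Python sketch. The map `bump` below is a hypothetical toy endomorphism chosen only for illustration (it raises every term by one, capped at its index); it is not taken from the paper. It acts termwise, so it preserves length and prefix order, and it never decreases a sequence lexicographically, so the hypothesis of Theorem 1 holds with k = 1.

```python
def orbit_to_period(alpha, u):
    """Iterate alpha from u until a point repeats.

    Returns (steps needed to reach the first periodic point, period).
    """
    seen = {}
    step = 0
    while u not in seen:
        seen[u] = step
        u = alpha(u)
        step += 1
    return seen[u], step - seen[u]


def bump(a):
    """Toy endomorphism (an assumption for illustration): a_i -> min(a_i + 1, i)."""
    return tuple(min(x, i - 1) + 1 if i > 0 else 0 for i, x in enumerate(a))


if __name__ == "__main__":
    n = 3
    steps, period = orbit_to_period(bump, (0, 0, 0, 0))
    print(steps, period, steps <= n * (n + 1) // 2)
```

Starting from (0, 0, 0, 0) the orbit is (0, 1, 1, 1), (0, 1, 2, 2), (0, 1, 2, 3), so a fixed point is reached after 3 steps, within the bound n(n + 1)/2 = 6 for n = 3.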

Self-describing sequences 

We define an endomorphism $\delta : A \to A$ transforming sequences in $A$ by

\[ (\delta a)_i = \#\{ j \mid j < i,\ a_j < a_i \}. \]

Thus, for each term $t = a_i$ in the sequence $a$, $(\delta a)_i$ counts the number of previous terms that
are smaller than $t$. The transformation $\delta$ makes perfect sense even for sequences outside of
$A$, but the image is in $A$ and it stays there under further iterations. A sequence that is
fixed under $\delta$ is called a self-describing sequence. Therefore, the sequence $a = (a_0, a_1, \dots)$
is self-describing if

\[ \#\{ j \mid j < i,\ a_j < a_i \} = a_i, \]

for all indices, i.e., every term $t$ is equal to the number of previous terms that are smaller
than $t$.
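A direct transcription of this definition into Python may help in experimenting with small cases. This is a minimal sketch; the function names `delta` and `is_self_describing` are illustrative, and sequences are represented as tuples.

```python
def delta(a):
    """Replace each term by the number of previous terms that are smaller."""
    return tuple(sum(1 for j in range(i) if a[j] < a[i]) for i in range(len(a)))


def is_self_describing(a):
    """A sequence is self-describing when it is fixed by delta."""
    return delta(a) == tuple(a)


if __name__ == "__main__":
    print(delta((0, 0, 1)))               # (0, 0, 2)
    print(is_self_describing((0, 0, 2)))  # True
    print(delta((5, 3, 4)))               # (0, 0, 1), an input outside A lands in A
```

For instance, (0, 0, 1) is not self-describing because its last term 1 has two smaller predecessors, while its image (0, 0, 2) is.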

The Catalan family tree 

We describe now a rooted labelled subtree of $T$, denoted by $C$ and called the Catalan
family tree or just the Catalan family. The root vertex belongs to $C$. It has two children
named 0 and 1 and we consider 0 the older sibling. The oldest sibling in this family always
has 2 children, the second oldest 3, the third oldest 4, and so on. The oldest child of a
member of the family $x$ gets named after the oldest sibling of $x$, the second oldest child
after the second oldest sibling, and so on, until $x$ uses its own name for its second to last
child and $n$ for the youngest one, where $n$ is the generation number of the children (the
level in the tree). The diagram in Figure 2 depicts the family members of $C$ up to the
third generation.




Figure 2: The Catalan family tree C up to the third generation 
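The naming rule of the Catalan family can be turned into a generation-by-generation construction. The following Python sketch is one possible reading of that rule, not the author's code: each group of siblings is kept as a list ordered from oldest to youngest, and the function name `catalan_generations` is an illustrative choice.

```python
def catalan_generations(n_max):
    """Full names of the Catalan family, generation by generation (0..n_max)."""
    families = [[(0,)]]                 # generation 0: the root has no siblings
    generations = [[(0,)]]
    for n in range(1, n_max + 1):
        new_families = []
        for family in families:
            names = [member[-1] for member in family]
            for position, member in enumerate(family):
                # Children are named after the older siblings of `member`,
                # then after `member` itself, and the youngest child is named n.
                child_names = names[:position + 1] + [n]
                new_families.append([member + (c,) for c in child_names])
        families = new_families
        generations.append([m for family in families for m in family])
    return generations


if __name__ == "__main__":
    gens = catalan_generations(4)
    print([len(g) for g in gens])       # 1, 2, 5, 14, 42
    print(gens[2])
```

The second generation produced this way is (0,0,0), (0,0,2), (0,1,0), (0,1,1), (0,1,2), and each of these five full names is easily checked by hand to be self-describing.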



The connection 

We establish now a connection between the self-describing sequences and the Catalan 
family tree. 

Theorem 2. The full names of the members of the Catalan family are precisely the
self-describing sequences. In other words, they are the fixed points of the endomorphism $\delta$.

Moreover, repeated applications of $\delta$ to any sequence in $A$ eventually produce a member
of the Catalan family, i.e., a fixed point of $\delta$. The number of applications needed to reach
such a point from a sequence in $A_n$ is $O(n^2)$.

All statements of the theorem are implied by Theorem 1 and the following lemma.

Lemma 1. If $a$ is a member of the Catalan family then $a = \delta a$. Otherwise, $a < \delta a$.

Proof. The proof is by induction on the generation number $n$. The statement is true
for $n = 0$ and $n = 1$. Assume that the statement is true for all vertices up to the $n$-th
generation.

Let

\[ a = (a_0, a_1, \dots, a_n, x) \]

be an $(n + 1)$-st generation member of the Catalan family. We consider two cases.

If $x = n + 1$ then

\[ \#\{ j \mid j < n + 1,\ a_j < x \} = \#\{ j \mid j < n + 1,\ a_j < n + 1 \} = n + 1 = x, \]

and $a$ is a fixed point of $\delta$.

If $x \ne n + 1$, then $a_n \ge x$ and there exists an $n$-th generation member of the Catalan
family whose full name is

\[ a' = (a_0, a_1, \dots, a_{n-1}, x), \]

namely the one after whom $a$ was named. We have

\[ \#\{ j \mid j < n + 1,\ a_j < x \} = \#\{ j \mid j < n,\ a_j < x \} = x, \]

where the first equality comes from the fact that $a_n \ge x$ and the second from the inductive
hypothesis, since $\delta a' = a'$.

Thus all members of the Catalan family are fixed under $\delta$.

Now, let

\[ a = (a_0, a_1, \dots, a_n, x) \]

be a full name of a vertex in $T$ in the $(n + 1)$-st generation that is not a member of the Catalan
family $C$. If any proper prefix of $a$ is not in $C$ we obtain the claim directly from the
inductive hypothesis. Thus we may assume that

\[ a'' = (a_0, a_1, \dots, a_n) \]

is a member of the Catalan family. Since $a$ is not in $C$ we have $a_n \ne x$ and $n + 1 \ne x$. We
consider two cases.

If $a_n > x$ then $a' = (a_0, a_1, \dots, a_{n-1}, x)$ is not in $C$ and

\[ \#\{ j \mid j < n + 1,\ a_j < x \} = \#\{ j \mid j < n,\ a_j < x \} > x, \]

where the equality comes from the fact that $a_n > x$ and the inequality from the inductive
hypothesis.

If $a_n < x < n + 1$ then

\[ \#\{ j \mid j < n + 1,\ a_j < x \} = \#\{ j \mid j < n,\ a_j < x \} + 1 \ge x + 1, \]

where the equality comes from the fact that $a_n < x$ and the inequality from the inductive
hypothesis. The equality in the last case is possible only when $a' = (a_0, a_1, \dots, a_{n-1}, x)$ is
in $C$. □
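The lemma and the step bound from Theorem 1 can be sanity-checked by brute force on small generations. The Python sketch below is only an experiment under the definitions above, not part of the proof; the helper names are illustrative. It relies on the fact that Python compares tuples lexicographically.

```python
from itertools import product


def delta(a):
    """Replace each term by the number of previous terms that are smaller."""
    return tuple(sum(1 for j in range(i) if a[j] < a[i]) for i in range(len(a)))


def sequences(n):
    """All tuples (a_0, ..., a_n) with 0 <= a_i <= i."""
    return product(*(range(i + 1) for i in range(n + 1)))


def check_lemma(n):
    """Return (a <= delta(a) for every a in generation n, worst number of steps)."""
    holds, worst = True, 0
    for a in sequences(n):
        holds = holds and a <= delta(a)   # tuple comparison is lexicographic
        steps = 0
        while delta(a) != a:              # iterate until a fixed point is reached
            a, steps = delta(a), steps + 1
        worst = max(worst, steps)
    return holds, worst


if __name__ == "__main__":
    for n in range(6):
        holds, worst = check_lemma(n)
        print(n, holds, worst, n * (n + 1) // 2)
```

The last two printed columns compare the observed worst case with the bound n(n + 1)/2 from the proof of Theorem 1.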

We proceed by counting the self-describing sequences with fixed length. In addition,
we obtain a result on the distribution of names in $C$. Recall that the $n$-th Catalan number
is equal to

\[ C_n = \frac{1}{n + 1} \binom{2n}{n}. \]

A recursive definition of the Catalan numbers is given by

\[ C_0 = 1, \qquad C_{n+1} = C_0 C_n + C_1 C_{n-1} + \cdots + C_n C_0. \]
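As a quick check that the closed formula and the recursive definition of the Catalan numbers agree, here is a small Python sketch. The function names are illustrative; `math.comb` requires Python 3.8 or later.

```python
from math import comb


def catalan_closed(n):
    """Closed formula: binomial(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def catalan_recursive(n_max):
    """List [C_0, ..., C_{n_max}] built from the convolution recurrence."""
    c = [1]
    for n in range(n_max):
        c.append(sum(c[i] * c[n - i] for i in range(n + 1)))
    return c


if __name__ == "__main__":
    print(catalan_recursive(6))
    print([catalan_closed(n) for n in range(7)])
```

Both lines print 1, 1, 2, 5, 14, 42, 132.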
Theorem 3. The number of self-describing sequences in $A_n$, i.e., the number of $n$-th
generation members of the Catalan family, is the $(n + 1)$-th Catalan number $C_{n+1}$.

Moreover, for $r = 0, \dots, n$, the number of $n$-th generation members of the Catalan
family whose name is $r$ is equal to $C_r C_{n-r}$.

Proof. Denote by $z_n$ the number of $n$-th generation members of the Catalan family whose
name is 0. More generally, for $r = 0, \dots, n$ denote by $f_{n,r}$ the number of $n$-th generation
members of the Catalan family whose name is $r$. Finally, denote by $g_n$ the number of
$n$-th generation members of the Catalan family.

Since the oldest child of every member of the Catalan family is named 0, we have, for
all $n$,

\[ z_{n+1} = g_n. \]

Since the youngest sibling in the $r$-th generation is always named $r$ and the oldest 0,
we also have, for all $r$,

\[ f_{r,r} = f_{r,0} = z_r. \]

For some fixed $r$, consider the set of $f_{r,r}$ $r$-th generation members named $r$ together
with all their descendants in $C$ whose names are greater or equal to $r$. This forest of $f_{r,r}$
identical subtrees of $C$ contains all members of $C$ whose name is $r$. Moreover, each tree in
this forest looks exactly like the Catalan family tree, except that all labels are increased
by $r$. Indeed, each $r$-th generation member of $C$ named $r$ has two children, named $r$ and
$r + 1$, the oldest sibling always has two children, the second oldest three, etc. Thus, for
any $n$ and $r = 0, \dots, n$, the number $f_{n,r}$ of $n$-th generation members of $C$ named $r$ is $f_{r,r}$
times larger than the number of $(n - r)$-th generation members of $C$ named 0, i.e.,

\[ f_{n,r} = f_{r,r} f_{n-r,0} = z_r z_{n-r}. \]

Since $z_0 = 1$ and

\[ z_{n+1} = g_n = f_{n,0} + f_{n,1} + \cdots + f_{n,n} = z_0 z_n + z_1 z_{n-1} + \cdots + z_n z_0 \]

we conclude that, for all $n$, $z_n$ is the $n$-th Catalan number. The statements of the
theorem follow now easily from the relations $g_n = z_{n+1}$ and $f_{n,r} = z_r z_{n-r}$. □
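The distribution of names can be checked experimentally for small generations. The Python sketch below finds the self-describing sequences by brute force (as the fixed points of the counting transformation) and tallies their last terms; the helper names are illustrative and nothing here replaces the proof.

```python
from collections import Counter
from itertools import product
from math import comb


def delta(a):
    """Replace each term by the number of previous terms that are smaller."""
    return tuple(sum(1 for j in range(i) if a[j] < a[i]) for i in range(len(a)))


def catalan(n):
    return comb(2 * n, n) // (n + 1)


def name_distribution(n):
    """Number of self-describing sequences of length n + 1 ending in r, for r = 0..n."""
    counts = Counter()
    for a in product(*(range(i + 1) for i in range(n + 1))):
        if delta(a) == a:
            counts[a[-1]] += 1
    return [counts[r] for r in range(n + 1)]


if __name__ == "__main__":
    for n in range(6):
        observed = name_distribution(n)
        expected = [catalan(r) * catalan(n - r) for r in range(n + 1)]
        print(n, observed, expected, sum(observed) == catalan(n + 1))
```

For n = 3 the tally is 5, 2, 2, 5, which sums to 14, the fourth Catalan number.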

Connection to other Catalan trees and objects 

It is well known that the Catalan numbers appear naturally under many circumstances.
The exercises on Catalan numbers in [Sta99] provide a trove of examples, along with
references, in which Catalan numbers count the number of objects of particular type and
size. The self-describing sequences provide yet another example that we now relate to
some other objects counted by the Catalan numbers.

Consider the sequences in $A$ with the property that $a_{i+1} \le a_i + 1$, for all indices (see
Exercise 6.19.u in [Sta99]). Such sequences are called sequences with unit increase.
The rooted labelled tree that corresponds to the set of sequences with unit increase looks
the same as the Catalan family tree, just with a different labelling and we obtain an easy
bijective correspondence between the self-describing sequences and the sequences with
unit increase. We could use this bijective connection to show that the Catalan numbers
count the number of self-describing sequences. Instead, we provided a direct proof of
Theorem 3 and the reason is that there is an important difference in the distribution of
labels in the Catalan family tree and the tree of the sequences with unit increase.

Theorem 4. For $r = 0, \dots, n$, the number of $n$-th generation vertices in the tree of
sequences with unit increase labelled by $r$ is

\[ \frac{r + 1}{n + 1} \binom{2n - r}{n}. \]

Proof. Let $(a_0, a_1, \dots, a_n)$ be a sequence with unit increase. Following Exercise 6.19.u
in [Sta99], we define, for $i = 0, \dots, n - 1$,

\[ b_i = a_i - a_{i+1} + 1. \]

Construct a sequence of $n$ 1's and $n - a_n$ negative 1's by replacing each $b_i$, $i = 0, \dots, n - 1$,
by one 1 followed by $b_i$ negative 1's. The newly obtained sequence has non-negative partial
sums. The correspondence between the sequences in $A_n$ with unit increase that end by
$r$ and the sequences of $n$ 1's and $n - r$ negative 1's with non-negative partial sums is
bijective. It is shown in [Bai96] that the number of sequences with non-negative partial
sums that consist of $n$ 1's and $k$ negative 1's is equal to

\[ \frac{n + 1 - k}{n + 1} \binom{n + k}{n}, \]

and this implies our claim. □

In passing, we make a slightly more general remark. Namely, for a fixed positive integer
$m$, consider the sequences with the property that $a_0 = 0$ and $0 \le a_{i+1} \le a_i + m$, for all
indices. Such sequences are called sequences with $m$-increase. We can easily construct the
rooted labelled tree that corresponds to such sequences. For a sequence $(a_0, a_1, \dots, a_n)$
with $m$-increase, define, for $i = 0, \dots, n - 1$,

\[ b_i = a_i - a_{i+1} + m. \]

Following the same approach as before, construct a sequence of $n$ $m$'s and $mn - a_n$ negative
1's by replacing each $b_i$, $i = 0, \dots, n - 1$, by one $m$ followed by $b_i$ negative 1's. The
newly obtained sequence has non-negative partial sums and the correspondence between
the sequences $(a_0, a_1, \dots, a_n)$ with $m$-increase that end by $r$ and the sequences of $n$ $m$'s
and $mn - r$ negative 1's with non-negative partial sums is bijective. Such sequences
are discussed in [FS01], where simple recursive formulae for their number are provided.
Unfortunately, closed formulae are not provided yet, but we note that the number of $n$-th
generation sequences with $m$-increase is given by $C_m(n + 1)$ where

\[ C_m(n) = \frac{1}{mn + 1} \binom{(m + 1)n}{n}. \]

The last displayed number is the generalization of the Catalan numbers which counts, for
example, the number of rooted $(m + 1)$-ary trees with $n$ interior vertices.
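The counts in this section can be compared with a brute-force enumeration. The Python sketch below assumes the standard ballot-number formula for sequences of n 1's and k negative 1's with non-negative partial sums, namely (n + 1 - k)/(n + 1) times binomial(n + k, n), and the generalized Catalan number binomial((m + 1)n, n)/(mn + 1); the function names are illustrative.

```python
from collections import Counter
from math import comb


def m_increase_sequences(n, m=1):
    """All (a_0, ..., a_n) with a_0 = 0 and 0 <= a_{i+1} <= a_i + m."""
    level = [(0,)]
    for _ in range(n):
        level = [a + (x,) for a in level for x in range(a[-1] + m + 1)]
    return level


def ballot(n, k):
    """Assumed formula: sequences of n 1's and k -1's with non-negative partial sums."""
    return (n + 1 - k) * comb(n + k, n) // (n + 1)


def generalized_catalan(m, n):
    """C_m(n) = binomial((m + 1) n, n) / (m n + 1)."""
    return comb((m + 1) * n, n) // (m * n + 1)


if __name__ == "__main__":
    for n in range(6):
        last = Counter(a[-1] for a in m_increase_sequences(n))
        print(n, [last[r] for r in range(n + 1)],
              [ballot(n, n - r) for r in range(n + 1)])
    for m in (1, 2, 3):
        print(m, [len(m_increase_sequences(n, m)) for n in range(5)],
              [generalized_catalan(m, n + 1) for n in range(5)])
```

For unit increase and n = 2 the last terms 0, 1, 2 occur 2, 2, 1 times, in contrast with the distribution 2, 1, 2 of names in the Catalan family tree.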

It is worth noting that Julian West [Wes95] recursively constructs a rooted labelled
tree whose root is labelled by 2 and each vertex labelled by $x$ has $x$ children labelled by
$2, 3, \dots, x + 1$. This tree, which West calls a Catalan tree, looks again exactly like the
Catalan family tree, but with different labels. In fact, the tree of the sequences with unit
increase can be obtained from the Catalan tree constructed by Julian West by decreasing
all labels by 2.

Similarly, in the spirit of the Julian West construction, for any positive integer $m$,
construct a rooted labelled tree whose root is labelled by $m + 1$ and each vertex labelled
by $x$ has $x$ children labelled by $m + 1, m + 2, \dots, m + x$. The tree of sequences with
$m$-increase can be obtained from this tree by decreasing all labels by $m + 1$.
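The succession rule described here is easy to iterate level by level. The following Python sketch is an illustration of the rule as stated in the text, with a hypothetical function name `west_level`; the case m = 1 is the tree with root 2 in which a vertex labelled x has children 2, ..., x + 1.

```python
def west_level(n, m=1):
    """Labels in generation n of the tree with root m + 1 in which a vertex
    labelled x has x children labelled m + 1, ..., m + x."""
    labels = [m + 1]
    for _ in range(n):
        labels = [c for x in labels for c in range(m + 1, m + x + 1)]
    return labels


if __name__ == "__main__":
    print([len(west_level(n)) for n in range(5)])        # 1, 2, 5, 14, 42
    print([x - 2 for x in west_level(2)])                # last terms with unit increase
    print([len(west_level(n, m=2)) for n in range(4)])   # sizes for 2-increase
```

Decreasing the labels of generation 2 by 2 gives 0, 1, 0, 1, 2, which are the last terms of the five sequences with unit increase of length 3.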

Mirror symmetry and mutually describing sequences 

We introduce another endomorphism $\gamma : A \to A$ transforming sequences in $A$ by

\[ (\gamma a)_i = \#\{ j \mid j < i,\ a_j \ge a_i \}. \]

Clearly $\gamma = \mu\delta$ where $\mu$ is the mirror involution of $A$ given by

\[ (\mu a)_i = i - a_i. \]

We call $\mu$ the mirror involution of $A$ since $\mu$ mirrors the tree $T$ through its vertical axis
of symmetry.



The endomorphism $\gamma$ is studied in [Sun02]. Clearly, $\gamma$ has no fixed points other than
the sequence (0). However, $\gamma$ has a lot of double points. If $a$ is a double point of $\gamma$ then
so is $b = \gamma a$. Moreover, then $\gamma b = a$ and the sequences $a$ and $b$ mutually describe each
other.
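Double points can be explored with a short Python sketch. It assumes the mirror-related map counts the previous terms that are greater than or equal to the current one, which is what composing the mirror involution with the self-describing map gives; the function names are illustrative.

```python
from itertools import product


def delta(a):
    """Number of previous terms that are smaller, for each term."""
    return tuple(sum(1 for j in range(i) if a[j] < a[i]) for i in range(len(a)))


def mirror(a):
    """Mirror involution: a_i -> i - a_i."""
    return tuple(i - x for i, x in enumerate(a))


def gamma(a):
    """Number of previous terms that are greater than or equal, for each term."""
    return tuple(sum(1 for j in range(i) if a[j] >= a[i]) for i in range(len(a)))


def sequences(n):
    return list(product(*(range(i + 1) for i in range(n + 1))))


def double_points(n):
    """Sequences of generation n that return to themselves after two steps."""
    return [a for a in sequences(n) if gamma(gamma(a)) == a]


if __name__ == "__main__":
    print(all(gamma(a) == mirror(delta(a)) for a in sequences(4)))
    print([len(double_points(n)) for n in range(5)])
    print([(a, gamma(a)) for a in double_points(2)])
```

In generation 2 the double points come in mutually describing pairs: (0,0,0) with (0,1,2), and (0,0,2) with (0,1,0).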

Theorem 5 ([Sun02]). Repeated applications of $\gamma$ to any sequence in $A$ eventually produce
a double point of $\gamma$. The number of applications needed to reach a double point in $A_n$
is $O(n^2)$ and there are more than $2^n$ such points.

The sequence that counts the number of double points of 7 in the n-th generation 
starts as follows 

1,2,4,10,26,70,216,... 

This sequence does not appear in the Encyclopedia of Integer Sequences [SP95] nor in
the online version [Slo] as of January 2002. It is interesting that we have such a good
understanding of the fixed points of $\delta$, via the Catalan family tree, but we are still not
able to count the number of double points of the mirror related endomorphism $\gamma = \mu\delta$.

Some other endomorphisms leading to fixed or double points are studied in [Sun02].
For one of them, the set of double points of length $n$ is in bijective correspondence with
the Young tableaux of size $n$.

Acknowledgements 

Thanks to Richard Stanley and Louis Shapiro for their interest and input. 



References 

[Bai96] D. F. Bailey, Counting arrangements of 1's and −1's, Math. Mag. 69 (1996),
no. 2, 128-131.

[FS01] Darrin D. Frey and James A. Sellers, Generalizing Bailey's generalization of the
Catalan numbers, Fibonacci Quart. 39 (2001), no. 2, 142-148.

[Slo] N. J. A. Sloane, http://www.research.att.com/~njas/sequences/.

[SP95] N. J. A. Sloane and Simon Plouffe, The encyclopedia of integer sequences,
Academic Press Inc., San Diego, CA, 1995.

[Sta99] Richard P. Stanley, Enumerative combinatorics. Vol. 2, Cambridge University
Press, Cambridge, 1999, With a foreword by Gian-Carlo Rota and appendix 1
by Sergey Fomin.

[Sun02] Zoran Sunik, Young tableaux and other mutually describing sequences, Journal
of Integer Sequences 5 (2002), no. 1, Article 02.1.5.

[Wes95] Julian West, Generating trees and the Catalan and Schröder numbers, Discrete
Math. 146 (1995), no. 1-3, 247-262.






